Project Title
INFO 526 - Summer 2024 - Final Project
Abstract
We chose this topic because members of our team have extensive experience with, or interest in, crime data. We analyzed two datasets from 2023 for this project. The first covers violent crimes across Arizona by FBI jurisdiction, and the second covers national property crimes by state. The result of the data analysis
Plot Setup
Introduction - dataset 1
Our team’s shared background in data analysis and genuine curiosity about public safety led us to examine Arizona’s violent crime dataset. The data splits offenses into three overarching groups (Property, Persons, and Society), each of which contains numerous subcategories. To keep our focus sharp, we concentrated on those three main categories. We set out to map how violent crime varies across Arizona’s jurisdictions and to pinpoint which offense types dominate in different regions. In particular, we analyze the most frequently reported crimes.
Here are the two key research questions we’re tackling using the Arizona dataset: (1) Which cities recorded the highest per-capita crime rates in 2023, and which specific offenses drove those rates? (2) How do violent crime rates in Arizona compare between urban and rural areas?
We propose two hypotheses for our Arizona analysis: (1) Violent crime rates will be highest in jurisdictions with the greatest population density. (2) Among violent crime categories, property offenses will occur more frequently than society offenses, which in turn will exceed offenses against persons.
Justification of approach - dataset 1
This R script processes 2023 Arizona crime data by agency and produces two key visualizations to communicate crime patterns across the state. We selected these approaches to work within the limitations of the dataset while providing clear and interpretable visualizations.
The first plot is a geospatial map of Arizona cities by total crimes. We chose this format because it allows for an intuitive spatial comparison of crime totals in different geographic locations. Given the large number of cities in the dataset, this map format helps users quickly identify areas with higher crime activity by scaling the size of city points relative to total crime counts. The map uses a consistent color for all points to maintain clarity and avoid visual clutter. City names are labeled on the map, providing geographic context without overwhelming the visualization. The decision to size points based on total crime counts makes it easy to visually distinguish between cities with low, medium, and high crime levels while preserving the map’s readability.
The second plot is a bar plot of the top five Arizona cities by total crimes and most common crime type, giving a more detailed comparison. Because there are too many cities to compare meaningfully in a single bar plot, narrowing the focus to the top five provides a more digestible comparison. The bar plot arranges cities in descending order, making it easy for viewers to immediately identify which cities experience the most crime. Additionally, each bar is color-coded according to the most common crime type in that city (e.g., Assault Offenses, Drug/Narcotic Offenses, Theft Offenses).
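As a minimal sketch of the top-five summary described above (not the project's actual script), the following uses made-up counts. The data frame `crimes` and its columns `city`, `offense`, and `n` are assumptions for illustration only.

```r
library(dplyr)
library(ggplot2)

# Toy offense counts (made-up values, NOT the project's data).
# Column names `city`, `offense`, and `n` are hypothetical.
crimes <- data.frame(
  city    = rep(c("A", "B", "C", "D", "E", "F"), each = 2),
  offense = rep(c("Assault Offenses", "Larceny/Theft Offenses"), times = 6),
  n       = c(50, 30, 20, 45, 10, 5, 70, 90, 15, 25, 40, 35)
)

# Total offenses per city, plus the single most common offense type.
top5 <- crimes %>%
  group_by(city) %>%
  summarise(
    total       = sum(n),
    top_offense = offense[which.max(n)],
    .groups     = "drop"
  ) %>%
  slice_max(total, n = 5) %>%   # keep the five largest totals
  arrange(desc(total))

# Bars in descending order, filled by each city's most common offense.
p_top5 <- ggplot(top5, aes(x = reorder(city, -total), y = total, fill = top_offense)) +
  geom_col() +
  labs(x = "City", y = "Total reported offenses", fill = "Most common offense")
```

Computing `top_offense` inside the same `summarise()` call keeps the fill variable aligned with the totals, so the bar colors cannot drift out of sync with the ranking.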
Code and Visualization - dataset 1
Discussion 1
RQ1: Which cities recorded the highest per-capita crime rates in 2023, and which specific offenses drove those rates?
The bar plot addresses this question, highlighting Mesa, Glendale, Tempe, Pima, and Scottsdale as the cities with the highest total crime counts in Arizona. Mesa leads with nearly 60,000 reported offenses in 2023, with Drug and Narcotic Offenses being the most common category. Glendale follows with almost 40,000 total crimes, primarily driven by Assault Offenses. Tempe, Pima, and Scottsdale each reported close to 30,000 offenses, with Larceny and Theft Offenses being the most prevalent in these cities.
In the geospatial map (first plot), the larger point sizes around the Phoenix metropolitan area—including these neighboring cities—visually demonstrate that this region experiences the highest concentration of crime activity in Arizona.
RQ2: How do violent crime rates in Arizona compare between urban and rural areas?
The geospatial map reveals that, as expected, major cities report the highest total crime counts, indicated by larger points clustered in urban areas like Phoenix, Mesa, and Glendale. While it was initially anticipated that crime would be concentrated exclusively in urban centers, the map also shows that some smaller cities by population, such as Safford and Camp Verde, recorded notable crime totals relative to their size.
Although violent crimes are generally more frequent in large metropolitan areas, this finding suggests that certain rural and suburban areas in Arizona also experience considerable crime rates, warranting attention when comparing crime patterns across different population centers in the state.
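RQ1 is framed per capita, while the plots report totals. One way to put large and small jurisdictions on the same footing is a rate per 100,000 residents. The sketch below assumes a `population` column is available for each jurisdiction, which is an assumption rather than something the dataset is known to include, and all values are made up.

```r
library(dplyr)

# Toy totals and populations (made-up values, NOT the project's data).
# The `population` column is an assumption for illustration.
cities <- data.frame(
  city       = c("Large city", "Small town"),
  total      = c(30000, 900),
  population = c(500000, 10000)
)

# Rate per 100,000 residents = offenses / population * 100,000.
rates <- cities %>%
  mutate(rate_per_100k = total / population * 1e5) %>%
  arrange(desc(rate_per_100k))
```

In this toy example the small town has far fewer offenses in total but the higher rate per 100,000 residents, which is the kind of pattern a totals-only map can hide.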
Introduction - dataset 2
Our team’s deep experience in data analysis and our keen interest in property offenses prompted us to examine the 2023 NIBRS property-crime figures for every U.S. state. This dataset breaks down four key offense types (Burglary, Larceny/Theft, Vehicle Theft, and Vandalism), providing both total incident counts and rates per 100,000 residents. To keep our study focused, we zeroed in on these core categories. First, we used a color-shaded U.S. map to illustrate how overall property-crime rates vary from state to state. Then, we used a grouped bar chart to compare the breakdown of property crime subtypes across the ten states with the highest total property crime counts. Through these visuals, we aim to highlight regional trends and identify which types of property crime are most common in the states most affected.
Our second dataset compiles 2023 property-crime figures for every U.S. state, reported both as total incidents and as rates per 100,000 residents. By focusing on NIBRS’s four core property-crime categories (burglary, larceny/theft, motor-vehicle theft, and vandalism), we streamline our analysis to examine how state population size correlates with overall property-crime burden and to spotlight which states bear the highest rates. We chose to use the major property-crime types rather than all property-crime types, because grouped statistics are more useful for our visuals and analysis.
Proposed Research Questions: (1) Which states stand out as property-crime hot spots, and how do their per-capita rates compare to the national average? (2) Across the top ten states by property-crime volume, which subtype dominates, and how does its share fluctuate from state to state?
Proposed Hypotheses: (1) States with large urban centers have property-crime rates per 100,000 residents that are higher than the national average, while states with predominantly rural populations have property-crime rates below the national average. (2) Within the top ten states by total property crimes, larceny/theft will constitute at least 60% of reported property offenses, while the proportions of vehicle theft and vandalism will differ significantly across states, reflecting underlying regional factors.
Justification of approach - dataset 2
Our selection of methods and visuals is driven by a desire to balance big-picture perspective with targeted insight, guiding readers smoothly from state-level comparisons down to specific crime-type breakdowns.
First, we chose a choropleth map to leverage the brain’s natural ability to detect patterns in color. By shading each state according to its property-crime rate per 100,000 residents, the map instantly highlights geographic clusters—whether they’re sprawling urban corridors or economically stressed rural pockets—without overwhelming the audience with raw numbers. The clean legend and intuitive viridis palette direct attention to the darkest and lightest regions, making it clear where policy interventions or deeper investigation might be most warranted.
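A choropleth of this kind could be built roughly as follows. This is a sketch under stated assumptions, not the project's script: the `rates` values are made up, and `map_data("state")` needs the `maps` package installed and covers only the contiguous states plus the District of Columbia, with lowercase names in its `region` column.

```r
library(dplyr)
library(ggplot2)  # map_data("state") also requires the 'maps' package

# Toy rates (made-up values, NOT the project's data).
# `region` must use the lowercase state names found in map_data("state").
rates <- data.frame(
  region        = c("arizona", "new mexico", "idaho"),
  rate_per_100k = c(2000, 3000, 1000)
)

# Polygon outlines for the contiguous states; left_join() keeps the
# polygon vertex order intact, which geom_polygon() depends on.
states_map <- map_data("state")
map_df     <- left_join(states_map, rates, by = "region")

# States without a rate are drawn in light grey.
p_map <- ggplot(map_df, aes(x = long, y = lat, group = group, fill = rate_per_100k)) +
  geom_polygon(color = "white") +
  coord_quickmap() +
  scale_fill_viridis_c(na.value = "grey90", name = "Rate per 100,000") +
  theme_void()
```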
Next, the Cleveland dot plot ranks all fifty states from lowest to highest, using distinct colors for the bottom five, middle bulk, and top five jurisdictions. Adding dashed cut-off lines emphasizes the thresholds for our outlier groups, underscoring our focus on the extremes as potential case studies. This plot complements the map by quantifying each state’s distance from the national median—nuance that color alone can’t fully convey—while keeping the visual clean and data-ink efficient.
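The ranking and grouping behind a plot like this can be sketched as below. The data are made up (twelve placeholder states rather than the full set), and the names `rates`, `dot_df`, and `group` are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Toy rates for twelve placeholder states (made-up values, NOT the project's data).
rates <- data.frame(
  state         = paste("State", LETTERS[1:12]),
  rate_per_100k = seq(1000, 3200, by = 200)
)

# Rank from lowest to highest and label the bottom five, top five, and the rest.
dot_df <- rates %>%
  arrange(rate_per_100k) %>%
  mutate(
    rank  = row_number(),
    group = case_when(
      rank <= 5      ~ "Bottom 5",
      rank > n() - 5 ~ "Top 5",
      TRUE           ~ "Middle"
    ),
    state = factor(state, levels = state)  # fix the y-axis order to the ranking
  )

# Dashed lines sit halfway between the outlier groups and the middle group.
p_dot <- ggplot(dot_df, aes(x = rate_per_100k, y = state, color = group)) +
  geom_point(size = 3) +
  geom_hline(yintercept = c(5.5, nrow(dot_df) - 4.5), linetype = "dashed") +
  labs(x = "Property-crime rate per 100,000", y = NULL, color = NULL)
```

Converting `state` to a factor after sorting is what makes the dots appear in rank order; otherwise ggplot2 would order the axis alphabetically.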
Finally, the grouped bar chart zeroes in on the ten states with the highest total property-crime volumes. Displaying burglary, larceny/theft, vehicle theft, and vandalism side by side, with percentage labels indicating each subtype’s share, allows us to unpack the composite factors behind those soaring totals. This layered strategy—from map to ranking to subtype breakdown—ensures readers see the forest, then the trees, and finally the leaves, enabling both macro- and micro-level interpretation. This approach shines a spotlight on state-by-state hotspots and outliers while peeling back the layers to show which property-crime categories drive each state’s overall rate—keeping our visuals laser-focused on the questions we set out to answer.
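The percentage labels depend on each subtype's share within its state. A minimal sketch of that reshaping step follows, using made-up counts; the column names (`burglary`, `larceny_theft`, `vehicle_theft`, `vandalism`) are assumptions and may differ from the project's data.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy counts in wide format (made-up values, NOT the project's data).
counts <- data.frame(
  state         = c("State A", "State B"),
  burglary      = c(100, 50),
  larceny_theft = c(600, 300),
  vehicle_theft = c(200, 100),
  vandalism     = c(100, 50)
)

# Long format: one row per state and subtype, with the share within each state.
shares <- counts %>%
  pivot_longer(-state, names_to = "subtype", values_to = "count") %>%
  group_by(state) %>%
  mutate(share_pct = 100 * count / sum(count)) %>%
  ungroup()

# Side-by-side bars with percentage labels above each bar.
dodge <- position_dodge(width = 0.9)
p_bar <- ggplot(shares, aes(x = state, y = count, fill = subtype)) +
  geom_col(position = dodge) +
  geom_text(aes(label = paste0(round(share_pct), "%")), position = dodge, vjust = -0.3)
```

Because the shares are computed within `group_by(state)`, they sum to 100% for each state, so a threshold such as the 60% in the second hypothesis can be checked directly against `share_pct`.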
Code and Visualization - dataset 2
The R script is structured into clearly labeled sections, each introduced with brief comments. It begins by loading the dataset and then builds three plot-ready data frames: one merged with map_data("state") for the choropleth, another with custom y-positions and group labels for the Cleveland dot plot, and a third in long format (with percentage shares) for the grouped bar chart. Every ggplot2 call incorporates custom scales, handpicked color palettes, informative titles, and refined themes—making the entire workflow reproducible and immediately understandable.
Discussion 2
RQ1: Which states emerge as outliers in property-crime rates, and how do their per-capita figures compare?
Our choropleth map and Cleveland dot plot highlight the District of Columbia, New Mexico, and Washington as the clear high-rate outliers, whereas Idaho, New Hampshire, and Maine occupy the lowest positions on the spectrum (FBI, 2025). These disparities underscore how urban density, economic stress, and transient populations—such as DC’s daily commuter influx—can amplify opportunities for property offenses. In contrast, states with more rural character appear to benefit from tighter-knit communities and lower population concentrations, which likely suppress overall property-crime burdens (FBI, 2025).
RQ2: Within the ten states with the largest total property-crime counts, which subtype predominates, and how much variation exists?
Larceny/theft consistently accounts for the majority of incidents across the highest-volume states, underscoring its dominant role in overall property-crime patterns (FBI, 2025). The shares of vehicle theft and vandalism, by contrast, vary considerably from state to state. Nevada and New Mexico report elevated vandalism incidents. Colorado’s higher vehicle-theft rate may reflect its metropolitan parking dynamics (FBI, 2025). Each state has a unique set of circumstances, including poverty, population count, and rural/suburban/urban status. As a result, the type and amount of property crime vary greatly across the United States.
Sources
Referenced for the across() function: https://dplyr.tidyverse.org/reference/across.html
Colorblind-friendly palette: https://grafify.shenoylab.com/colour_palettes.html
Referenced for the pretty breaks function, used to make spacing in the Arizona map readable: https://cran.r-project.org/web/packages/scales/index.html
Referenced for the sf package and functions, including st_centroid() to compute center points for cities: https://cran.r-project.org/web/packages/sf/index.html
Referenced for tigris package and functions: https://cran.r-project.org/web/packages/tigris/tigris.pdf
For data used: Federal Bureau of Investigation. (2025). National Incident-Based Reporting System 2023 property-crime extract files [Data set]. U.S. Department of Justice, Office of Justice Programs. https://www.ojp.gov/library/publications/national-incident-based-reporting-system-2023-extract-files